Two experiments were performed to determine the extent to which individual
[differences in hypothesis generation can be predicted. In the first]
experiment, several published tests of creativity were used as predictors of
[hypothesis generation; the Alternate Uses test was the best predictor]
of hypothesis generation performance. In a second experiment, measures of
achievement, general mental ability, and information were included with
Alternate Uses as predictors of performance. Again Alternate Uses was the best
predictor of performance. Several variants of the Alternate Uses test were
also employed to isolate the components of hypothesis generation. It was found
that two components were involved: retrieval of implicit dimensions of the
objects and retrieval of uses when the dimensions are explicitly provided. The
latter component was found to be by far the more important. It was concluded
that good hypothesis generators have skills that enable them to effectively
retrieve information stored in memory.

Predicting Individual Differences in Hypothesis Generation


Hypothesis generation is the cognitive process by which alternative
explanations or hypotheses are created to account for information or data, and
it is a skill that has received little attention. A physician, for example,
should generate plausible disease possibilities or diagnoses from the many
diseases that a patient with a particular symptom complex might have. An
average automobile driver may also generate a number of hypotheses which may
explain why his car malfunctions. Hypothesis generation can be an important
cognitive process whenever the decision maker is faced with uncertain or
equivocal data where multiple hypotheses are possible. In this situation it is
desirable that a decision maker be able to generate a hypothesis set which
includes the true hypothesis.

There are at least two important phases in the hypothesis generation process.
Hypotheses are rarely created "de novo"; rather, they are usually retrieved
from memory. Furthermore, hypotheses are also assessed for plausibility after
retrieval, and implausible hypotheses are discarded. The decision maker then
uses the set of plausible hypotheses as an input to the actual decision
process.

A model of hypothesis generation has been proposed that postulates a
multi-stage generation process that is controlled by an executive process
(Gettys and Fisher, in press). According to this model, the executive process
initiates a memory search to retrieve possible explanations for a set of data.
Hypotheses may be retrieved through direct associational linkages with the data
or may be retrieved through indirect or mediated linkages. Once a hypothesis
has been retrieved, it is checked for consistency with respect to any remaining
data. Hypotheses that survive this initial logical consistency check may be
subjected to further processing.

When one or more hypotheses have been retrieved, the executive process may
transfer control to a process called plausibility assessment. This process
consists of a more thorough examination of the hypothesis than is provided by
consistency checking. The plausibility of an individual hypothesis is assessed
to determine if it is a plausible explanation for the data. If a hypothesis is
found to be plausible, it is added to those hypotheses that the decision maker
is currently entertaining. This collection of hypotheses is called the
"current hypothesis set". The decision to continue the memory search process
or terminate the search is based on an assessment of the current hypothesis
set. If the current hypothesis set is believed to be incomplete, the search
continues.

Gettys, Fisher and Mehle (1978), and Gettys and Fisher (in press) have shown
that although hypothesis generation performance is generally poor, there are
large individual differences among decision makers. Many decision makers are
poor hypothesis generators while others are fairly proficient; therefore, it
should be worthwhile to develop measures to predict the hypothesis generation
performance of an individual. Furthermore, an understanding of the cognitive
skills necessary for proficient hypothesis generation might allow remedial
training for those who are deficient in these skills.


Two experiments were conducted to pursue these goals. The first experiment
examined the utility of published tests of creativity for predicting
hypothesis generation performance. A second experiment refined the
most promising of these measures, and examined several additional possible
predictors of hypothesis generation.

For the first experiment, a survey was made of published creativity tests to
identify those that might be used as predictors of hypothesis generation
performance. Certain types of creative thought seem to share many
characteristics with good hypothesis generation. We were interested in
examining the type of creativity often attributed to scientists and other
problem solvers rather than artistic creativity. While the latter type of
creativity is often characterized by uniqueness of thought and response, the
former requires that the thought also be productive or useful (Mednick, 1962).
Scientific creativity is similar to hypothesis generation in that both
processes use convergent and divergent thinking to attain a solution to a
problem. Scientists and other hypothesis generators use clues provided by
available data to retrieve hypotheses from memory using divergent thinking.
When hypotheses are retrieved from memory they are assessed for both
consistency and plausibility using convergent thinking, and hypotheses must
meet both criteria to be acceptable. Artistic creativity, on the other hand,
operates in a less constrained environment; it involves divergent thought that
leads to many solutions or ideas.

For these reasons we chose four tests of creativity which we believed captured
some of the essence of unusual but productive thought. The Alternate Uses test
(Christensen, Guilford, Merrifield, and Wilson, 1960) was chosen as a predictor
because it required subjects to think of uses other than the ordinary one for a
common object, while allowing creative people to produce productive, logical
uses rather than fantastic or unreasonable ones. The Remote Associations Test
(Mednick and Mednick, 1967) was also used because subjects who do well on this
test can recognize nonobvious remote semantic relationships. This is a quality
that should also be a component of good hypothesis generation. A subtest of
the AC Test of Creative Ability, similar to Alternate Uses, which we called
"Possible Reasons" (AC Spark Plug Division, General Motors Corp., 1953), was
used because it was quite similar to a hypothesis generation task; subjects
were provided with a description of a correlational relationship between
objects or occurrences (for example, "corn and tomatoes grow better if they are
planted in the same field than if they are planted separately") and were
required to think of all possible reasons to explain why this relationship
might exist.

Another test called "Cards" was developed to tap the subjects' divergent
inductive reasoning capabilities, as divergent induction could also be a
component of hypothesis generation. In this task, subjects were asked to
discover all possible rules that might have been used to generate a sequence of
four playing cards.

Two types of tasks measuring hypothesis generation ability were used as
criterion variables. Both tasks required subjects to generate all possible
hypotheses that were consistent with the data provided by the experimenter. In
one task, the Hypothesis Generation task, the data were characteristics of
States of the Union, animals, occupations or academic majors. Subjects were
asked to generate as many instances as possible for each category that were
consistent with all of the data. In a second task, the Geography task,
subjects were provided with maps of four existing geographical locations as
well as three pieces of additional information about each location and were
asked to think of all possible identities for each location. The process of
using the available data to infer possible hypotheses about the way the land
areas were utilized was believed to be a more realistic approximation of the
hypothesis generation process used in everyday life. To some degree the
Geography task simulates the cognitive processing used by intelligence analysts
or photo interpreters. Both tasks require the subject to retrieve possible
hypotheses from memory, and examine them to ensure that they are consistent
with all of the data.

In  addition,  it  seems  reasonable  that  the  amount  of  information  available  in 
memory  would  influence  the  ability  to  generate  hypotheses;  consequently  we 
included  a task  to  measure  the  amount  of  specific  information  possessed  by  a 
subject  about  the  particular  Geography  and  Hypothesis  Generation  problems.  The 
amount  of  information  known  by  subjects  was  roughly  assessed  by  asking  the 
subjects  to  rate  the  plausibility  of  good,  medium,  and  poor  hypotheses  for  the 
particular  problems  encountered  earlier.  They  were  also  allowed  the 
opportunity  to  choose  up  to  three  hypotheses  from  the  available  list  to  add  to 
their original hypothesis set. We hypothesized that if subjects have
considerable information and are creative, they should do well on the criterion
tasks. If they have considerable information and moderate to low creative
abilities, they should do moderately well on the criterion tasks, but not as
well as those who have high creative abilities. However, creative potential
should be of little use in the hypothesis generation tasks for subjects having
little information.


Experiment  1 


Method 

Subjects. The 99 subjects who participated in Experiment 1 were introductory
psychology students who received class credit for their participation. Data
from two additional subjects were discarded because one subject did not return
to the second session, while the other did not understand the instructions for
one of the tests.


Details of the tests in the test battery. There were four categories of
tests administered to the subjects. These categories included 1) Criterion
measures of hypothesis generation performance, 2) Tests of creativity, 3) An
inductive reasoning task, and 4) Tests of information. These tests are
described below.


1) The two criterion tests were the "Hypothesis Generation" test and the
"Geography" test, which were described earlier.

On the Hypothesis Generation test, subjects were asked to list as many
hypotheses as possible in response to an item question such as "List as many
States as you can that are noted for the following products or industries: A.
Beef, B. Fish, and C. Aerospace Industry." The Hypothesis Generation test
consisted of eight items. Two of the items called for hypotheses about states
noted for products and industries, two called for hypotheses about animals
based on their physical characteristics, two called for hypotheses about
skilled occupations based on tradesmen's tools, and two called for hypotheses
about possible academic majors at the University of Oklahoma based on classes
that an OU student had taken.

Each problem on the Geography test consisted of a map and three additional
pieces of information. The task of the subject was to generate possible
hypotheses for an unidentified area on the map that were consistent with the
map and the additional information. An example Geography problem is shown in
the Method section of Experiment 2. The Geography test consisted of four
problems.

Both of the criterion hypothesis generation tasks were scored similarly by
first collecting all hypotheses generated by the subjects. A team of two
experimenters then rated each hypothesis for consistency. For the "Hypothesis
Generation" task, responses were rated either 'consistent' or 'inconsistent'
and the corresponding hypothesis was then given either 1 or 0 points, depending
on its assigned rating. For the "Geography" task, the responses were rated
using a three-point scale. Hypotheses that were consistent with all of the
available data were given 2 points, hypotheses that were consistent with all
but one piece of data were given 1 point, and hypotheses that were inconsistent
with more than one piece of data were given zero points. Each hypothesis
generated by a subject was scored using one of the scales described above. The
subject then received one score for each of the two criterion tasks that
consisted of the sum of the points earned for each hypothesis generated on each
problem of the two tasks.
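
The two scoring rules described above can be sketched in a few lines of
Python. This is only an illustration of the arithmetic, not the procedure the
experimenters actually used; the function names and the sample ratings are
hypothetical, and each Geography hypothesis is assumed to be summarized by the
number of pieces of data it conflicts with.

```python
def score_hypothesis_generation(consistent_flags):
    """Give 1 point per hypothesis rated consistent and 0 otherwise."""
    return sum(1 for flag in consistent_flags if flag)


def score_geography(conflicts_per_hypothesis):
    """Score each hypothesis by the number of data pieces it conflicts with.

    0 conflicts -> 2 points, 1 conflict -> 1 point, more than 1 -> 0 points.
    """
    points_by_conflicts = {0: 2, 1: 1}
    return sum(points_by_conflicts.get(n, 0) for n in conflicts_per_hypothesis)


# Hypothetical ratings for one subject (illustrative values only).
hg_total = score_hypothesis_generation([True, True, False, True])
geo_total = score_geography([0, 1, 3, 0, 2])
print(hg_total, geo_total)
```

A subject's score on each criterion task is then the sum of these values over
all problems of that task.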

2) The creativity measures were the Alternate Uses Test, the Remote Associations
Test, and the "Possible Reasons" test.

On an Alternate Uses problem, subjects attempted to think of all practical uses
for a common household object (e.g., a safety pin) that were not the usual use
of that object (e.g., pinning together pieces of cloth). There were five
problems of this type on the Alternate Uses test. The uses generated by
subjects were independently scored by two experimenters who assigned one point
for each appropriate answer. An inter-rater reliability coefficient was
obtained to check the consistency of evaluations of the scorers.

Each problem on the Remote Associations Test consisted of three adjectives
which were all remotely associated with a single noun. For example, "blue",
"rat", and "cottage" are all associated with the noun "cheese". Subjects were
required to write down a single response for each problem that was associated
with all of the adjectives. Fifteen of these problems were included on the
Remote Associations Test. A list of the correct responses was available for
the RAT. A subject's score on this test consisted of the total number of
correct responses.

Each problem on the "Possible Reasons" test consisted of a statement proposed
as fact. An example problem follows: "Babies born in the months of October and
November have better bones, on the average, than those born in the other ten
months of the year." Subjects were asked to write down as many explanations
for each statement as they could. Five such problems were included on this
test. Two experimenters independently scored each response. Two points were
given for a good response, that is, one that was highly plausible. One point
was given for a fairly plausible response, and no points were given for




The first was to rate the quality of the hypotheses using a five-point scale.
If the subject gave a hypothesis 5 points, it was considered to be "good",
while a rating of 1 was considered to be "bad". The second task of the
subjects was to compare our proposed hypotheses to the list of hypotheses they
had generated earlier. If they wished to add any of our hypotheses to their
list, they could indicate their desire to do so by making a check mark in the
box located next to the particular hypothesis.

This test was scored by one experimenter who computed one score for each
subject. The Information score was computed by taking the difference between
the sum of the points assigned by the subject to the "good" hypotheses and the
sum of the points assigned by the subject to the "bad" hypotheses. The best
score a subject could get on a problem using this method was 12 points: 15
points for assigning a "5" to each of the "good" hypotheses minus 3 points for
assigning a "1" to each of the "bad" hypotheses. The worst score was -12
points. A subject's score on this part was the sum of the points for each of
the eight problems.
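
As a check on the arithmetic, the Information score can be written out as a
short Python sketch. The function names are hypothetical, and the assumption of
three "good" and three "bad" hypotheses per problem is inferred from the stated
maximum of 15 and minimum of 3 points rather than given explicitly in the text.

```python
def problem_information_score(good_ratings, bad_ratings):
    """Sum of ratings given to "good" hypotheses minus sum given to "bad" ones."""
    return sum(good_ratings) - sum(bad_ratings)


def total_information_score(problems):
    """Add the per-problem scores; `problems` is a list of (good, bad) pairs."""
    return sum(problem_information_score(good, bad) for good, bad in problems)


# Extreme cases on the five-point scale, assuming three hypotheses of each kind.
best = problem_information_score([5, 5, 5], [1, 1, 1])
worst = problem_information_score([1, 1, 1], [5, 5, 5])
print(best, worst)
```

Under these assumptions the per-problem score ranges from -12 to 12, matching
the limits stated above.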

The Geography Information task closely resembled the Hypothesis Generation
Information task. In this task, subjects also rated the quality of nine
hypotheses provided by the experimenters, and indicated whether they would like
to add any of those hypotheses to the list of hypotheses they had generated
earlier. There were four problems of this type, each one corresponding to a
particular Geography problem found in the criterion task.


The scoring methods used for this task were identical to those described for
the Hypothesis Generation Information task. In addition, another three scores
were computed for each subject. These scores were designed to measure whether
the subject realized that the "ground truth" hypotheses were good. The map for
each Geography problem came from U.S. Census Tracts so that each problem had a
"right" answer. Subjects received one point for a "Generation" score if they
generated that right answer. They received one point for a "Recognition" score
if they did not generate the right answer but recognized that it was a good
answer. They received one point for a "Neither" score if they neither
generated nor recognized the right answer. For each problem, then, subjects
could receive only one point to be assigned to either the Generation,
Recognition, or Neither category. Overall, the subject's Generation,
Recognition, and Neither scores reflected the number of times they had done
each of those activities over the entire set of Geography Information problems.
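
The three-way classification can be illustrated with the following Python
sketch. The text does not say what rating counted as "recognizing" the right
answer, so the cutoff of 4 on the five-point scale is purely an assumption, as
are the function names and the sample data.

```python
from collections import Counter


def classify_problem(generated_right_answer, rating_of_right_answer, good_cutoff=4):
    """Assign one problem to exactly one category.

    `good_cutoff` is a hypothetical threshold; the report does not state one.
    """
    if generated_right_answer:
        return "Generation"
    if rating_of_right_answer >= good_cutoff:
        return "Recognition"
    return "Neither"


def tally_scores(problems):
    """Count the categories over a subject's Geography Information problems."""
    counts = Counter(classify_problem(gen, rating) for gen, rating in problems)
    return {name: counts.get(name, 0) for name in ("Generation", "Recognition", "Neither")}


# Hypothetical subject: (generated the right answer?, rating given to it).
scores = tally_scores([(True, 5), (False, 5), (False, 2), (False, 4)])
print(scores)
```

Because each problem falls into exactly one category, the three scores for a
subject always sum to the number of problems.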


Testing procedures. Subjects were tested in groups of about 20 during two
one-hour sessions. The tasks were administered in the following order:
Session 1: Geography Problems (15 minutes), Alternate Uses Test (10 minutes),
Remote Associations Test (15 minutes), Geographical Information Test (10
minutes). Session 2: Hypothesis Generation Task (17 minutes), "Cards"
Inductive Reasoning Test (10 minutes), "Possible Reasons" (10 minutes),
Hypothesis Generation Information Test (12 minutes).

Results and Discussion

Correlations between tests and hypothesis generation performance. A
correlational analysis was performed to determine the
relationship between tests of creativity and hypothesis generation performance.
The matrix of correlations between the predictor and the criterion variables is
shown in Table 1.


(Insert  table  1 about  here) 


Two of the tests of creativity were noticeably related to hypothesis generation
performance. The Alternate Uses test correlated significantly with both
hypothesis generation tasks. The correlation between Alternate Uses and the
Hypothesis Generation task was .33 (p < .001) and the correlation between
Alternate Uses and the Geography task was .27 (p < .006). The Possible Reasons
test correlated with the Geography task (r = .29, p < .004), but did not
correlate well with the Hypothesis Generation task (r = .06).


A second  analysis  was  performed  to  examine  the  predictive  relationship  between 
the  creativity  tests  and  an  equally  weighted  composite  of  the  Hypothesis 
Generation  test  and  the  Geography  test.  Here  also  the  Alternate  Uses  test  was 
the  best  predictor  (r  = .34,  p < .0001)  and  the  Possible  Reasons  test  was 
noticeably  inferior  (r  = .19,  p > .05). 
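
One way to form an equally weighted composite and correlate a predictor with it
is sketched below in Python. The text does not say how the two criterion scores
were combined, so the use of standardized (z) scores is an assumption, and the
function names and data are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize scores to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def equal_weight_composite(hg_scores, geo_scores):
    """Sum the two standardized criterion scores for each subject (assumed method)."""
    return [a + b for a, b in zip(z_scores(hg_scores), z_scores(geo_scores))]


def pearson_r(xs, ys):
    """Pearson correlation computed as the mean product of z scores."""
    return sum(a * b for a, b in zip(z_scores(xs), z_scores(ys))) / len(xs)


# Hypothetical scores for five subjects (not data from the experiment).
alternate_uses = [4, 7, 5, 9, 6]
composite = equal_weight_composite([10, 14, 11, 15, 12], [8, 12, 9, 13, 11])
print(round(pearson_r(alternate_uses, composite), 3))
```

Standardizing first keeps the task with the larger raw-score variance from
dominating the composite, which is one common reading of "equally weighted".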

INTERCORRELATION MATRIX FOR EXPERIMENT 1

              AU  CARDS     PR    RAT     GI    HGI     GE
    AU       1.0    .30    .28    .18    .10   -.05    .27
    CARDS    .30    1.0    .29    .19    .13    .09    .14
    PR       .28    .29    1.0    .25    .14    .03    .29
    RAT      .18    .19    .25    1.0    .26    .12    .21
    GI       .10    .13    .14    .26    1.0    .16    .05
    HGI     -.05    .09    .03    .12    .16    1.0    .08
    GE       .27    .14    .29    .21    .05    .08    1.0
    HG       .33    .02    .06    .12   -.02    .11    .12

    AU = ALTERNATE USES TEST
    CARDS = CARD TASK
    PR = POSSIBLE REASONS TEST
    RAT = REMOTE ASSOCIATIONS TEST
    GI = GEOGRAPHICAL INFORMATION TEST
    HGI = HYPOTHESIS GENERATION INFORMATION TEST
    GE = GEOGRAPHY TEST
    HG = HYPOTHESIS GENERATION TEST

Predicting hypothesis generation performance using multiple regression. A
multiple correlation was calculated to determine the extent to which all of the
predictor variables could predict hypothesis generation performance as measured
by the composite score. The multiple correlation was .477, which was
significantly different from 0 (F = 4.63, p < .001). We then compared this
model to a reduced model formed by eliminating predictor variables that were
weakly related to the criterion score. We found that a reduced model containing
only the Alternate Uses test was statistically indistinguishable from the full
model containing all the predictor variables. This result suggests that only the
Alternate Uses test need be employed as a predictor of hypothesis generation,
and that this variable is by far the best predictor of hypothesis generation
performance of the six variables studied.

The conclusions that can be drawn from this series of analyses are simple, and
somewhat negative. First, only the Alternate Uses test, among the three
creativity tests, predicts hypothesis generation performance. Secondly, our
measure of the subjects' information was not a good predictor of hypothesis
generation performance. This result suggests that being able to categorize
hypotheses in terms of their consistency with the data is not related to
hypothesis generation performance as measured by our two criterion measures.
This result is consistent with the explanation that the major difficulty that
subjects have in hypothesis generation is memory retrieval, and that the
ability to assess hypotheses is not nearly as potent a predictor of hypothesis
generation performance as the ability to retrieve hypotheses from memory. This
effect also could be due to the low difficulty of the information task. The
task required that subjects respond to hypotheses that differed grossly with
respect to their suitability; the subjects may have performed fairly
consistently on this task, and differences between subjects may have been due
to characteristics such as motivation and care in responding accurately to all
the questions.


Thirdly, the "Cards" test, which was designed to measure divergent inductive
reasoning capabilities, did not correlate noticeably with either of the
criterion tasks. Here also individual differences in retrieval from memory may
have been far more important than inductive reasoning.

The largely negative results of Experiment 1, where only the Alternate Uses
test seemed to be related to hypothesis generation, may be due to a number of
factors. First, Experiment 1 was designed to survey a large number of
predictors, and we were not able to use a large number of items in each
predictor for reasons of time. Therefore, the correlations which were obtained
may have been reduced by the unreliability of the limited measures we obtained.
Second, we noticed that the inter-item reliabilities in our criterion tasks
were low, suggesting that the items were not necessarily measuring the same
abilities. Consequently, we decided to pick the most promising of the predictor
variables and study it in more detail while simultaneously improving the
reliability of the criterion measure.

Experiment  2 

The  Alternate  Uses  test  was  found  to  be  the  best  predictor  of  hypothesis 
generation  performance  in  Experiment  1.  There  may  be  several  component  skills 
which  influence  performance  on  this  test  that  are  also  important  in  hypothesis 
generation.  Therefore,  Experiment  2 was  conducted  partially  to  identify  the 
relative  importance  of  these  components. 

Consider the processes which might be used to generate alternate uses for a
safety pin. A safety pin is a physical object which can be characterized along
a number of dimensions: it is made of spring steel, it is sharp, it conducts
electricity, and it is fire resistant. The fact that it is sharp and can be
sterilized in a match flame suggests that it can be used for minor surgery,
such as removing a splinter. It can serve as a makeshift fishhook because
it is spring steel, is sharp, and is hook-shaped. These dimensions may
therefore serve as implicit retrieval cues for a memory search, and the ability
to recall dimensions may be one component which is important for this task.


Retrieval of uses from memory is a second component that logically must be
involved in hypothesis generation. Previous research (Gettys and Fisher, in
press; Gettys, Fisher and Mehle, 1978) has identified this ability as being
critically important. In the Alternate Uses test, the subject must be able to
make a thorough search of memory in order to retrieve the alternate uses. This
search may or may not be based on the implicit retrieval cues that an analysis
of an object by its physical dimensions provides.

Hypothesis generation may also involve the same two components: 1) retrieval of
properties or characteristics of some object or entity, and 2) searching memory
for hypotheses. In the Geography task, maps are provided which identify the
areas that surround an unknown area. In generating hypotheses for the
unknown area, the subject should exploit the implicit information created
by the identification of the surrounding areas. For example, if the unknown
area is surrounded by an area identified as residential, then it is unlikely
that the unknown area will be used for activities that are considered noxious
in a suburban area, such as a stockyard or a racecar track. (An example
Geography problem is shown in detail in the Method section.)


Retrieval from memory is the second component, and in the Geography task, the
subject retrieves hypothesized uses for the unknown area from either the
explicit information provided by the problem or the implicit information that
they infer from the explicit information.




With these ideas in mind, two additional versions of the Alternate Uses Test
were designed in the hope that they would be relatively pure measures of
either 1) retrieval of the properties or characteristics of an object or 2)
retrieval of hypotheses from implicit or explicit cues. The original Alternate
Uses test was assumed to involve both components to some degree. A second
version of Alternate Uses was developed which measured the ability of the
subjects to retrieve the implicit dimensions of an object. The third version of
Alternate Uses provided the dimensions of the object to the subject, thereby
making them explicit, and so measured the ability to retrieve additional
hypotheses from these dimensions. By examining the extent to which these three
versions of the Alternate Uses test predict hypothesis generation performance,
the relative importance of the two proposed components in hypothesis generation
can be assessed.

A second reason for conducting Experiment 2 was to increase the reliability of
the predictor and criterion measures. The Alternate Uses test, which had been
administered in an abbreviated form during Experiment 1, was doubled in length.
Because the Alternate Uses test was found to predict performance as well as a
linear combination of all the predictors from Experiment 1, for the second
experiment the remaining creativity tests were discarded from the battery of
predictors. The Hypothesis Generation test was discarded from the set of
criterion variables because of low inter-item correlations; this test evidently
was not a pure measure of hypothesis generation. The remaining criterion test,
the Geography task, was doubled in length, and ambiguities in some items were
corrected.

A third reason for conducting Experiment 2 was to examine several potential
predictors that were not included in Experiment 1 for reasons of time. These
predictors were incorporated into Experiment 2 so that their relationship to
hypothesis generation could be determined. These predictors consisted of
measures of achievement and general ability and measures of episodic memory.
It was decided to include the Information scale of the WAIS as a measure of
general mental ability and the Verbal, Quantitative, and Composite scores from
the ACT test as measures of achievement. It is possible that these factors
alone could account for hypothesis generation performance. Although creativity
is supposed to be a process that is unrelated to intelligence (Anastasi, 1968),
it has been found that correlations between Alternate Uses and other measures
of general intelligence lie between .2 and .3 (Guilford, Christensen,
Merrifield, and Wilson, 1978).

The decision to measure episodic memory was made because of the possibility
that subjects' performance on Alternate Uses was based solely on their
experiences, and that this experience was a limiting factor in memory
retrieval. A search of episodic or situational memory (Tulving, 1972) would
lead subjects to retrieve object uses that they have seen implemented in
a particular situation. On the other hand, a search of semantic memory, which
is not directly based on past experiences with the object, should lead
subjects to retrieve object uses that are instead composed from ideas drawn
from a more general memory store. A question included on Parts One and Three
of the Alternate Uses test asked the subjects to indicate whether they had
used or seen the object used in the way they had specified. Good
hypothesis generators should use both semantic and episodic retrieval. For
this reason, we would expect that people who do well on the Geography
hypothesis generation task would show less dependence on episodic memory
than those who do poorly.

Finally, the original test of information about the Geography problems was
discarded and another was created in which fifteen possible hypotheses were
presented for each of four of the Geography problems, and subjects were
required to judge whether those hypotheses were consistent with the map and
each written datum.


Method 

Subjects. The 101 subjects included in the experiment were introductory
psychology students who received class credit for participating in the
experiment. No data collected in this experiment were discarded.

Overview of the tests used in the test battery. Four categories of tests were
administered to the subjects: 1) the criterion measure of hypothesis
generation performance; 2) three versions of the Alternate Uses test that
measured retrieval of uses from implicit characteristics of the objects,
retrieval of characteristics of the objects, and retrieval of uses when
characteristics of the objects are provided; 3) achievement and general
ability tests; and 4) a test of information about the criterion measure.


1) The sole criterion measure was the Geography task. More discussion of this
measure is now warranted. An example of the Geography problems can be seen in
figure 1. The maps used in these problems were copied from U.S. Census tracts.
The additional written information that accompanied each map was created so
that it was ambiguous enough to allow many hypotheses to be consistent. For
this example, the actual location in figure 1 is a county fairgrounds; however,
an amusement park, indoor arena or fieldhouse, community college, civic center
or convention center, park, amphitheater, or exposition center are examples of
other consistent hypotheses. The expanded version of the Geography task
contained eight problems.


(Insert figure 1 about here)


The Geography task was scored in the same manner as it was in Experiment 1.
All responses for each problem were listed, then two experimenters rated each
response using a three-point scale. Again, 2 points were given for hypotheses
that were consistent with all the data, 1 point was given for hypotheses that
were consistent with all but one piece of data, and zero points were given for
hypotheses that were inconsistent with more than one piece of data. A
subject's score on the Geography task was the sum of the points received for
each hypothesis for all problems.
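
The three-point scoring rule for the Geography task can be sketched as follows. This is a minimal illustration, not the experimenters' procedure: each response is represented by a count of the data it contradicts (standing in for a rater's judgment), the function names are hypothetical, and the way the two raters' judgments were combined is not described in the text and is not modeled.

```python
def points_for(contradicted_data: int) -> int:
    """Points for one hypothesis, given how many pieces of data it contradicts."""
    if contradicted_data == 0:
        return 2  # consistent with all the data
    if contradicted_data == 1:
        return 1  # consistent with all but one piece of data
    return 0      # inconsistent with more than one piece of data


def geography_score(problems) -> int:
    """Sum of points over every hypothesis on every problem.

    `problems` is a list of problems; each problem is a list with one
    contradicted-data count per hypothesis the subject generated.
    """
    return sum(points_for(count) for problem in problems for count in problem)


# Hypothetical subject: three problems with 4, 2, and 1 responses.
example = [[0, 0, 1, 3], [1, 2], [0]]
print(geography_score(example))  # (2 + 2 + 1 + 0) + (1 + 0) + 2 = 8
```

Because the score is a sum, it rewards both the number of hypotheses generated and their consistency with the data.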

ADDITIONAL INFORMATION ABOUT AREA X

1.  Area  X serves  a county-wide  area. 

2.  Seasonal  events  that  attract  a large  number  of  people  are  scheduled  in 
area  X. 

3.  A wide  variety  of  activities  take  place  in  area  X. 

Area  X serves  a definite  purpose.  Think  of  as  many  possible  uses  for 
area  X as  you  can  that  are  consistent  with  all  the  information  provided 
(including  the  map)  and  list  those  uses  below. 


Figure  1.  Sample  problem  from  the  Geography  hypothesis  generation  task 
used  in  Experiment  2. 

subjects to mark one of two boxes labelled 'Yes' or 'No' in response to the
question "Have you ever used or seen the object used this way?".

Two experimenters independently rated responses for each of these tasks. For
the first task, a subject's AU1 score consisted of the number of appropriate
responses that were generated for all problems. The episodic memory score
(AUE1) was formed by computing the ratio of the number of uses on which the
subject said 'Yes' in answer to the question designed to measure episodic
memory retrieval to the total number of appropriate uses for all problems.

The score for Alternate Uses 2 (AU2) consisted of the sum of the number of
appropriate characteristics generated for each of the ten problems.

The scoring used for Alternate Uses 3 (AU3) was similar to that used for AU1 in
that it consisted of the sum of the number of appropriate uses generated for
all problems. However, in order for a response to be considered appropriate on
AU3 it must not only be a legitimate use, but it must also be different from
all responses generated for the corresponding problem on AU1. An episodic
memory score identical to that used for AU1 was also computed for AU3.
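
The Alternate Uses scoring rules above can be sketched in a few lines. The data layout, the function names, and the sample responses are all hypothetical. Two simplifications are assumptions of this sketch: the 'Yes' count is taken over appropriate uses only, and "different from all AU1 responses" is approximated by exact string comparison, whereas in the experiment that judgment was made by raters.

```python
def count_appropriate(responses):
    """Number of responses the raters judged appropriate."""
    return sum(1 for r in responses if r["appropriate"])


def episodic_ratio(responses):
    """Share of appropriate uses the subject marked 'Yes' (used or seen before)."""
    appropriate = [r for r in responses if r["appropriate"]]
    if not appropriate:
        return None  # ratio is undefined when no appropriate use was given
    return sum(1 for r in appropriate if r["seen_before"]) / len(appropriate)


def new_on_au3(au3_by_problem, au1_by_problem):
    """AU3 responses that are appropriate and were not given on AU1 for the same object."""
    kept = []
    for problem, responses in au3_by_problem.items():
        earlier = {r["use"] for r in au1_by_problem.get(problem, [])}
        kept += [r for r in responses if r["appropriate"] and r["use"] not in earlier]
    return kept


# Hypothetical responses for a single object.
au1_data = {"brick": [
    {"use": "doorstop", "appropriate": True, "seen_before": True},
    {"use": "paperweight", "appropriate": True, "seen_before": False},
    {"use": "eat it", "appropriate": False, "seen_before": False},
]}
au3_data = {"brick": [
    {"use": "doorstop", "appropriate": True, "seen_before": True},   # repeats AU1
    {"use": "bookend", "appropriate": True, "seen_before": True},
    {"use": "grind into pigment", "appropriate": True, "seen_before": False},
]}

au1_all = [r for responses in au1_data.values() for r in responses]
au3_new = new_on_au3(au3_data, au1_data)
print("AU1:", count_appropriate(au1_all), "AUE1:", episodic_ratio(au1_all))
print("AU3:", count_appropriate(au3_new), "AUE3:", episodic_ratio(au3_new))
```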

All subjects received five scores for this series of tests. These scores were
an Alternate Uses 1 score, which measured the ability to retrieve uses using
implicit characteristics of household objects; an Alternate Uses 1 Episodic
Memory score, which measured the extent to which episodic memory had a role in
this retrieval; an Alternate Uses 2 score, which measured the ability to
retrieve characteristics of the objects; an Alternate Uses 3 score, which
measured the ability to retrieve uses when the characteristics were explicit;
and an Alternate Uses 3 Episodic Memory score, which measured the use of
episodic memory in the Alternate Uses 3 test.


3) The measures of achievement and general ability were the Information scale
of the WAIS and subjects' Verbal, Quantitative, and Composite scores on the ACT
test. The Information scale of the WAIS consists of a series of questions
measuring a subject's general knowledge about the world, for example "Why are
dark clothes warmer than light clothes?" or "How far is it from Paris to New
York?". Twenty-five of the questions from the WAIS Information scale were used.
A subject's score on this test consisted of the total number of correct
answers given to all questions. The correct answers were obtained from the
WAIS manual (Wechsler, 1955).

The Verbal, Quantitative, and Composite scores from the ACT test were chosen to
measure achievement. The University of Oklahoma requires students to take the
ACT test in order to be admitted to the University. These scores are
considered to be predictors of grade point average, a measure of achievement in
school. Subjects gave written permission during the experiment for
experimenters to gain access to their ACT scores from school files.

4) The Geography Information test was used to measure the amount of information
possessed by subjects. Problems on this test consisted of the same map and
additional information that were found on the criterion Geography task.
Fifteen possible hypotheses about each problem were presented to subjects. The
task of a subject was to indicate whether the hypothesis was consistent with
each individual datum (the map, additional information 1, additional
information 2, and additional information 3) by writing a 'Y' or 'N' in a space
next to the hypothesis that was under a column corresponding to the particular
datum. Each subject thus made 40 yes or no responses regarding their feeling
of consistency or inconsistency about the fifteen proposed hypotheses. Four
problems of this type were used because the hypotheses proposed were some that
had been generated on the Geography task of Experiment 1, which contained only
four problems. A subject's score on this test consisted of the total number of
correct judgments about consistency.
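
Scoring the Geography Information test amounts to counting agreements with an answer key. The sketch below assumes such a key exists; the text does not say how the correct judgments were determined, and the dictionary layout, names, and sample entries are hypothetical.

```python
def information_score(judgments, key):
    """Count the subject's 'Y'/'N' judgments that agree with the key.

    Both arguments map (problem, hypothesis, datum) to 'Y' or 'N'.
    """
    return sum(1 for cell, answer in judgments.items() if key.get(cell) == answer)


# Hypothetical key and responses for one hypothesis on one problem.
key = {
    (1, "amusement park", "map"): "Y",
    (1, "amusement park", "info 1"): "Y",
    (1, "amusement park", "info 2"): "Y",
    (1, "amusement park", "info 3"): "N",
}
subject = {
    (1, "amusement park", "map"): "Y",
    (1, "amusement park", "info 1"): "N",
    (1, "amusement park", "info 2"): "Y",
    (1, "amusement park", "info 3"): "N",
}
print(information_score(subject, key))  # 3 of the 4 judgments match the key
```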

Testing Procedure. Groups of 10 - 12 subjects were administered the tasks
during a single two-hour session. The tasks were administered in the following
order: Alternate Uses Part One (15 minutes), Alternate Uses Part Two (10
minutes), Alternate Uses Part Three (15 minutes), Geography Test (30 minutes),
Geographical Information Test (20 minutes).

Results  and  Discussion 

Predictors of hypothesis generation performance. A correlational analysis
was performed between the predictor variables and the hypothesis generation
criterion to assess the extent to which each predictor variable was correlated
with the criterion variable. These results are shown in table 2.


(Insert  table  2 about  here) 


By far the best predictor of hypothesis generation performance was the
Alternate Uses test (AU1), which had a correlation of .51 (p < .0001) with the
Geography test (GE). The increase in this correlation (as compared to the
value of .2? obtained in Experiment 1) may be attributed to the increase in
reliability of both Alternate Uses and the Geography test. This result
indicates that hypothesis generation performance can be predicted from the
Alternate Uses test, which is a simple "paper and pencil" test that can be
administered in 15 minutes. With further development this correlation could
undoubtedly be increased.
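
As a rough check on the significance reported for this correlation, the usual t statistic for a Pearson correlation can be computed from r and the sample size. The sketch assumes all 101 subjects contributed to the correlation, which the text implies but does not state for every variable; the function names are illustrative, and the p value itself would be read from a t table.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values taken from the text: r = .51 between AU1 and GE, 101 subjects.
t = t_for_r(0.51, 101)
print(f"t({101 - 2}) = {t:.2f}")
```

A t value of this size with 99 degrees of freedom lies far beyond conventional critical values, which agrees with the very small p reported.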

Next to be addressed was whether the Alternate Uses test predicts hypothesis
generation performance over and above other potential predictors. The WAIS
Information scale correlated .24 (p < .016) with GE. This scale was chosen
because it is a good predictor of general mental ability, and tests of general
mental ability typically correlate in the .2 to .3 range with tests of
creativity. A partial correlation was calculated between Alternate Uses 1 and
the Geography test holding the WAIS Information score constant. This partial
correlation is a measure of the relationship between Alternate Uses and
hypothesis generation that is not accounted for by intelligence or general
ability. The partial correlation was .286. An approximate test of significance
was performed on that partial correlation, which was found to be significantly
different from 0. This result suggests that the Alternate Uses test measures
the ability to generate hypotheses and that this ability is different from
intelligence.
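
The first-order partial correlation and its approximate significance test can be sketched as below. The correlation between Alternate Uses 1 and the WAIS Information score is not reported in this section, so the three inputs to `partial_r` are purely illustrative and are not the study's values; only the significance test uses figures from the text (.286 and 101 subjects). Function names are assumptions.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def t_for_partial(r, n):
    """t statistic (n - 3 degrees of freedom) for a first-order partial correlation."""
    return r * math.sqrt((n - 3) / (1 - r * r))


# Illustrative inputs only (x = predictor, y = criterion, z = control variable).
print(f"illustrative partial r = {partial_r(0.50, 0.30, 0.25):.3f}")

# Reported partial correlation and sample size from the text.
print(f"t({101 - 3}) = {t_for_partial(0.286, 101):.2f}")
```

The formula shows why a control variable that correlates only weakly with both measures removes little of their relationship.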

The relative contribution of the component skills in hypothesis generation.
Several versions of the Alternate Uses test were created which reflected the
several proposed components of hypothesis generation. Alternate Uses 2 (AU2)
involves generating the implicit dimensions of an object, while Alternate
Uses 3 (AU3) is a relatively pure measure of memory retrieval from explicit
retrieval cues. These two versions of the Alternate Uses test should separate
the component abilities in the original Alternate Uses test. The "retrieval of
implicit dimensions" component of Alternate Uses (AU2) correlated .24 (p <
.017) with hypothesis generation and the "retrieval of uses from explicit
dimensions" component (AU3) correlated .49 (p < .0001). Clearly, the ability
to retrieve information efficiently from memory accounts for more of the
hypothesis generation performance than does the ability to retrieve dimensions,
but both contribute significantly to hypothesis generation performance.

An analysis of variance was used to get a better understanding of the relative
contribution of the "retrieval of implicit dimensions" component as compared to
the "retrieval of uses given dimensions" component. Subjects were assigned to
high, medium, and low AU2 groups by comparing their performance on AU2 to
tertile scores which partitioned the AU2 scores into three equally numerous
groups. Similarly, subjects were divided into high or low "retrieval of uses
given dimensions" groups by comparing their performance on AU3 to the median
score.
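
The grouping step can be sketched with the standard library. The cut-point method and the handling of scores that fall exactly on a cut point or on the median are not described in the text; the choices below (Python's default `statistics.quantiles` method, ties assigned to the lower group) are assumptions, as are the function names and the sample scores.

```python
import statistics


def tertile_group(score, cuts):
    """Classify a score as low, medium, or high using two tertile cut points."""
    low_cut, high_cut = cuts
    if score <= low_cut:
        return "low"
    if score <= high_cut:
        return "medium"
    return "high"


def assign_groups(au2_scores, au3_scores):
    """Return one (AU2 group, AU3 group) pair per subject."""
    cuts = statistics.quantiles(au2_scores, n=3)  # two cut points
    median = statistics.median(au3_scores)
    return [
        (tertile_group(a, cuts), "high" if b > median else "low")
        for a, b in zip(au2_scores, au3_scores)
    ]


# Hypothetical scores for nine subjects.
au2_example = [1, 2, 3, 4, 5, 6, 7, 8, 9]
au3_example = [9, 1, 8, 2, 7, 3, 6, 4, 5]
groups = assign_groups(au2_example, au3_example)
print(groups)
```

With tied or clustered scores the three AU2 groups will not be exactly equal in size, which is one reason the cells of the resulting design can have unequal numbers of subjects.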

A two-way Analysis of Variance with these AU2 and AU3 groupings as factors was
performed with the Geography score as the dependent measure. Because there were
an unequal number of subjects in each cell, the method of nonorthogonal
Analysis of Variance suggested by Appelbaum and Cramer (1974), in which
patterns of significance are examined, was used. This method was implemented
on the statistical package SAS in the manner suggested by Herr and Gaebelein
(1978). The interaction was tested first and was nonsignificant, so the main
effects could then be examined by performing "eliminating" and "ignoring" tests
on each variable of interest. The test of AU3 eliminating AU2, which tested
whether there was any evidence of an AU3 effect over and above the AU2 effect
present, was significant (F = 29.63, p < .0001). The test of AU2 eliminating
AU3, indicating any evidence of the effect of AU2 over and above the effect of
AU3, was marginally significant (F = 3.06, p < .0518). While AU2 has an effect,
as witnessed by this marginally significant result and its significant
correlation with GE (r = .24, p < .017), the "retrieval of uses given
dimensions" component, AU3, is by far the most important variable. Because no
interaction was found, it can be concluded that both components combine
additively. However, the main effect results indicate that performance on
AU3 is by far the most important predictor. The contributions of the two
proposed cognitive components to hypothesis generation are shown in figure 2,
which plots the means from the Analysis of Variance.
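
The logic of the "eliminating" and "ignoring" tests can be illustrated by comparing residual sums of squares from nested least-squares models. This is a sketch of the general model-comparison idea, not the SAS procedure the authors ran: the dummy coding, the use of the full interaction model for the error term, the function names, and the synthetic data are all assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(design, y):
    """Residual sum of squares after a least-squares fit of y on the design rows."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * v for row, v in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((v - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, v in zip(design, y))


def main_effect_tests(au2, au3, y):
    """F ratios; au2 is coded 0/1/2 (low/medium/high), au3 is coded 0/1 (low/high)."""
    def design(use_au2, use_au3, use_interaction=False):
        rows = []
        for a, r in zip(au2, au3):
            medium, high = float(a == 1), float(a == 2)
            row = [1.0]
            if use_au2:
                row += [medium, high]
            if use_au3:
                row += [float(r)]
            if use_interaction:
                row += [medium * r, high * r]
            rows.append(row)
        return rows

    rss_mean = rss(design(False, False), y)
    rss_au2 = rss(design(True, False), y)
    rss_au3 = rss(design(False, True), y)
    rss_additive = rss(design(True, True), y)
    rss_full = rss(design(True, True, True), y)
    mse = rss_full / (len(y) - 6)  # error term from the six-cell model
    return {
        "interaction": ((rss_additive - rss_full) / 2) / mse,
        "AU3 eliminating AU2": (rss_au2 - rss_additive) / mse,
        "AU3 ignoring AU2": (rss_mean - rss_au3) / mse,
        "AU2 eliminating AU3": ((rss_au3 - rss_additive) / 2) / mse,
        "AU2 ignoring AU3": ((rss_mean - rss_au2) / 2) / mse,
    }


# Synthetic, deliberately unbalanced data (not the study's data).
au2_codes, au3_codes, scores = [], [], []
for a in range(3):
    for r in range(2):
        for noise in (-1.0, 0.0, 1.0):
            au2_codes.append(a)
            au3_codes.append(r)
            scores.append(10 + 2 * a + 5 * r + noise)
del au2_codes[:2], au3_codes[:2], scores[:2]  # unequal cell sizes

for name, f_ratio in main_effect_tests(au2_codes, au3_codes, scores).items():
    print(f"{name}: F = {f_ratio:.2f}")
```

With equal cell sizes the "eliminating" and "ignoring" tests for a factor coincide; they diverge only when the design is unbalanced, which is why the pattern of both tests is examined in the nonorthogonal case.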


(Insert  figure  2 about  here) 


As can be seen in this figure, subjects who scored high in both components are
substantially better hypothesis generators than subjects who scored low on both
components. The superior hypothesis generator appears to be skilled in both
generating the implicit dimensions of a problem and in retrieving hypotheses
based on these dimensions. These abilities do not appear to be related, as
witnessed by the failure to find an interaction in the ANOVA, and by the
correlation of -.04 between AU2 and AU3 (see table 2). This result suggests
that these two components are independent skills or abilities. Of the two
components, retrieval of uses given dimensions seems to be the most important,
but both components contribute significantly to performance.


[Figure 2 plot. Vertical axis: hypothesis generation. Horizontal axis: ability
to generate implicit dimensions. Separate lines for the high "retrieval" and
low "retrieval" groups.]

Figure 2. Mean hypothesis generation scores of subjects who scored differently
on Alternate Uses 2 and Alternate Uses 3. Subjects were rated low,
medium, or high in the ability to generate implicit dimensions on
the basis of their Alternate Uses 2 score. The high "retrieval"
group consisted of subjects who scored above the median on Alternate
Uses 3, while the low "retrieval" group consisted of subjects who
scored below the median on Alternate Uses 3.

The retrieval of hypotheses from memory probably does not depend on episodic
memory, since the correlations between the episodic memory measures, AUE1 and
AUE3, and hypothesis generation performance are -.04 and -.05, respectively.
These results suggest that the ability to retrieve hypotheses is not heavily
dependent on personal experience with these hypotheses in the past. A good
hypothesis generator draws both on personal experiences in episodic memory and
on general information in semantic memory.

Finally, there was a significant correlation between the Geography Information
score and the criterion (r = .21, p < .04). This implies that our measure of
the amount of information possessed about a problem has a weak relationship to
hypothesis generation performance. It may be, however, that the test of
information measured the subject's hypothesis assessment abilities in judging a
degree of consistency or inconsistency rather than measuring information alone.
To test this, a revised Geography score was computed that consisted of the
proportion of good responses, those receiving 2 points, to the total number of
responses generated by each subject. The correlation of the GI score with this
revised score was .27, a slight gain in predictive ability. Because the gain
was slight, it is more likely that this task was too difficult for the
subjects, since it forced them to examine the consistency of implicit
characteristics in an explicit situation. The Geography Information task,
therefore, was probably not a pure measure of information possessed by
subjects. The correlations between GI and the WAIS Information score
[r(GI, WAIS) = .34, p < .0005], and GI and the ACT Composite score
[r(GI, ACTC) = .42, p < .0001] provide additional evidence for this
conclusion.
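
The revised Geography score described above is a simple proportion, sketched here with a hypothetical function name and made-up ratings.

```python
def proportion_good(points):
    """Proportion of a subject's responses that received the full 2 points.

    `points` holds one 0, 1, or 2 rating per response, pooled over problems.
    """
    if not points:
        return None  # undefined when the subject generated no responses
    return sum(1 for p in points if p == 2) / len(points)


# Hypothetical subject with four responses, two of them fully consistent.
print(proportion_good([2, 2, 1, 0]))  # 0.5
```

Unlike the summed score, this proportion does not grow with the sheer number of responses, so it isolates the quality of the hypotheses a subject chose to list.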


Component analysis of the predictors of hypothesis generation performance. A
principal components analysis was used as a technique for summarizing the
results of Experiment 2 to obtain a more global view of the results. A
principal components analysis was performed on the intercorrelation matrix of
the ten predictors and the Geography task. The purpose of this analysis was to
summarize the nature of the abilities underlying the individual differences
produced by these 11 measures. Specifically, we anticipated finding that the
ability or abilities encompassing the individual differences in hypothesis
generation would be unrelated to the abilities measured by the general
achievement-ability tests (ACT-V, ACT-Q, ACT-C, and WAIS-INFO). Also, we hoped
that the components structure would help clarify the nature of the
relationships between the hypothesis generation task and the various measures
based on the Alternate Uses test (AU1, AUE1, AU2, AU3, and AUE3).

Three principal components were extracted that accounted for 62 percent of the
trace of the intercorrelation matrix. The initial, unrotated solution did not
support a single factor solution; consequently, the three dominant components
were rotated both orthogonally (Varimax) and obliquely (Promax). Inspection of
the orthogonal and oblique solutions clearly indicated an orthogonal structure
(i.e., uncorrelated components).
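
The "percent of the trace" figure has a simple arithmetic meaning: the trace of a correlation matrix equals the number of variables (11 here), so the retained eigenvalues are divided by 11. The eigenvalues are not reported in the text; the three below are hypothetical values chosen only so that they sum to 62 percent of the trace, and the function name is illustrative.

```python
def proportion_of_trace(eigenvalues, n_variables, n_components):
    """Share of total variance carried by the largest components.

    For a correlation matrix the trace equals the number of variables.
    """
    largest = sorted(eigenvalues, reverse=True)[:n_components]
    return sum(largest) / n_variables


# Hypothetical eigenvalues for three retained components (not from the text).
hypothetical = [3.40, 2.10, 1.32]
print(round(proportion_of_trace(hypothetical, 11, 3), 2))  # 0.62
```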

The Varimax component-structure matrix is shown in Table 3. In Table 3
component loadings (correlations between observed variables and components)
less than .2 in absolute value have been suppressed. The resulting components
structure for the 11 variables has an unusually clear and unequivocal
interpretation. Component 1 is defined by the three ACT measures, the WAIS
Information test, and the Geographical Information test. This first component
we chose to call General Ability-Achievement. Notice that neither
hypothesis generation nor the Alternate Uses measures have any appreciable
correlation with the General Ability-Achievement component. Component 2 is
defined by AU1 (the Alternate Uses test administered under original
instructions), AU3 (the Alternate Uses test in which subjects generated uses
from a list of object properties explicitly provided by the experimenters),
and GE (the total hypothesis generation score).
From the nature of the three variables correlating most highly with Component
2, it is tentatively labeled Hypothesis Retrieval. The observed variables
AU1, AU3, and GE all require a search of semantic memory for ideas consistent
with certain information or data provided to the subjects. The Hypothesis
Retrieval component clearly does not subsume the ACT measures or the episodic
memory abilities tapped by AUE1 and AUE3.


(Insert  table  3 about  here) 


We suggest that Component 3 be named Episodic Memory since the observed
variables correlating highest with this component are AUE1 and AUE3 (the
measures of episodic memory derived from the Alternate Uses test). Recall that
AUE1 and AUE3 are simply the proportions of uses listed by subjects in AU1 and
AU3 that had actually been seen implemented in real situations. Notice also
that AU2, a measure of the ability to discern the basic properties of objects,
correlated moderately with Component 3. It may be that the determination of
the properties of objects draws more heavily upon episodic memory than upon
semantic memory.


TABLE 3

PRINCIPAL COMPONENTS ANALYSIS OF EXPERIMENT 2
VARIMAX ROTATED FACTOR MATRIX

          FACTOR 1         FACTOR 2       FACTOR 3
          GEN. ABILITY-    HYPOTHESIS     EPISODIC
          ACHIEVEMENT      RETRIEVAL      MEMORY

ACTV        .81              *              *
ACTQ        .80              *              *
ACTC        .94              *              *
WAIS        .76              *              *
GI          .53              *              *
GE           *              .80             *
AU1          *              .84             *
AU3          *              .79             *
AU2          *              .31            .34
AUE1         *               *             .86
AUE3         *               *             .82

* CORRELATIONS LESS THAN .2 HAVE BEEN SUPPRESSED

The three-component principal components analysis supported our prediction that
hypothesis generation skills are not significantly related to general
ability-achievement. Furthermore, the results of this analysis suggest that
the ability common to successful performance on the hypothesis generation task
and the Alternate Uses test may involve efficient and thorough searches of a
well-developed semantic memory.

Summary 

The  purpose  of  these  two  experiments  was  to  begin  an  inquiry  into  individual 
differences  in  hypothesis  generation.  The  first  experiment  was  a survey 
designed  to  identify  possible  predictors  of  hypothesis  generation  performance. 
In  this  experiment,  several  tests  of  creativity  which  were  believed  to  be 
related  to  hypothesis  generation  performance  were  examined,  as  well  as  several 
other  predictors  measuring  inductive  reasoning  and  information.  It  was  found 
that  only  the  Alternate  Uses  test,  which  measures  creative  thinking,  was 
consistently  related  to  hypothesis  generation. 

A second experiment was performed to examine the Alternate Uses test more
closely, and to examine other potential predictors of hypothesis generation
including general mental ability, achievement, and the relevance of episodic
memory retrieval. It was proposed that there are several components to
successful hypothesis generation, one in which implicit dimensions of the
object are retrieved, and one in which the hypothesis is retrieved using data
that is explicitly provided. Various versions of the Alternate Uses test were
constructed to measure these components, and results were obtained which
suggest that the component of retrieving hypotheses when the information is
provided is more important than retrieval of implicit dimensions, but that both
factors influence performance through an additive relationship.

A components analysis suggested that general mental ability and academic
achievement are only weakly related to hypothesis generation and that
hypothesis generation ability cannot be predicted adequately from these
variables alone. Rather, hypothesis generation ability seems to be more related
to the ability to search memory effectively, and hypothesis generation
performance can be predicted from the Alternate Uses test, which measures this
ability. The results also suggest that an individual's episodic memory is not
the only source of hypotheses; hypotheses are retrieved from both episodic
and semantic memory.

A tentative picture of hypothesis generation performance is emerging from these
results. As in other studies (Gettys and Fisher, in press; Gettys, Fisher, and
Mehle, 1978), the critical variable appears to be primarily whether information
that is stored in memory can be accessed. Hypothesis generation will not
succeed unless this information can be retrieved, and evidently the most
important variable is the efficiency of this retrieval process.